Skip to content

fix(sqs): make visibility timeout and batch size configurable#6

Open
hamstu wants to merge 2 commits intomainfrom
fix/sqs-visibility-timeout-and-batch-size
Open

fix(sqs): make visibility timeout and batch size configurable#6
hamstu wants to merge 2 commits intomainfrom
fix/sqs-visibility-timeout-and-batch-size

Conversation

@hamstu
Copy link

@hamstu hamstu commented Mar 19, 2026

Summary

  • The SQS client had a hardcoded VisibilityTimeout of 36,000 seconds (10 hours), which directly caused recurring PagerDuty alerts on the elasticsearch_indexer_k8s queue (ApproximateAgeOfOldestMessage climbing to ~10h)
  • Also hardcoded MaxNumberOfMessages: 1, limiting throughput to 1 message per receive call
  • Adds functional options pattern so callers can configure per-queue settings without breaking existing consumers
  • New ReceiveBatch() method returns all fetched messages; existing Receive() stays backwards-compatible

Changes

Setting Before After (default) Configurable?
VisibilityTimeout 36,000s (10h) 120s Yes, via WithVisibilityTimeout()
MaxNumberOfMessages 1 10 Yes, via WithMaxMessages()
WaitTimeSeconds 20 20 Yes, via WithWaitTimeSeconds()

Backwards Compatibility

  • NewClient() accepts variadic ...Option — existing callers passing 3 args still compile
  • Receive() still returns a single message (delegates to ReceiveBatch internally)
  • Default visibility timeout changed from 10h to 120s — this is the intended fix. Any consumer relying on 10h visibility was likely broken

Test plan

  • Verify reply-mongo-to-es worker still compiles and works with the new defaults
  • Deploy publish-es-indexer with the updated library and confirm the SQS age-of-oldest-message alert resolves
  • Monitor that messages are processed and deleted within the 120s window

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features
    • Configurable receive parameters for message handling: visibility timeout, batch size, and wait time with sensible defaults and input clamping.
    • Option-style configuration when creating the client to customize receive behavior.
    • Added batch message retrieval alongside existing single-message receive.
    • Centralized receive logic with improved error wrapping and consistent behavior.

The SQS client had a hardcoded VisibilityTimeout of 36000 seconds (10
hours), which caused the ApproximateAgeOfOldestMessage metric to climb
to ~10h before messages expired from in-flight status. This directly
triggered recurring PagerDuty alerts on the elasticsearch_indexer_k8s
queue.

Changes:
- Add functional options pattern (WithVisibilityTimeout,
  WithMaxMessages, WithWaitTimeSeconds) so callers can configure
  per-queue settings
- Change defaults to sensible values: 120s visibility, 10 max messages,
  20s wait time
- Add ReceiveBatch() method that returns all messages (up to max)
- Keep Receive() backwards-compatible (returns first message only)
- Existing callers that pass no options get the new defaults
  automatically

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link

coderabbitai bot commented Mar 19, 2026

Walkthrough

The client adds a functional options pattern (Option plus WithVisibilityTimeout, WithMaxMessages, WithWaitTimeSeconds) and accepts opts ...Option in the constructor. Defaults are visibilityTimeout=120, maxMessages=10, waitTimeSeconds=20; option helpers clamp inputs to valid ranges. Receive logic was centralized in receiveMessages; Receive returns the first message (or nil) by delegating to that helper, while ReceiveBatch returns the full slice of received messages using the configured parameters. AWS errors from receives are wrapped with a consistent message.

🚥 Pre-merge checks | ✅ 2
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main changes: making SQS visibility timeout and batch size (maxMessages) configurable via options pattern, which is the primary objective.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

📝 Coding Plan
  • Generate coding plan for human review comments

Comment @coderabbitai help to get the list of available commands and usage tips.

Tip

CodeRabbit can suggest fixes for GitHub Check annotations.

Configure the reviews.tools.github-checks setting to adjust the time to wait for GitHub Checks to complete.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@sqs/sqs.go`:
- Around line 81-107: The Receive method currently calls ReceiveBatch which
requests up to c.maxMessages from SQS then returns only the first message,
causing other messages to be hidden and lost; fix by making Receive perform a
true single-message poll (set MaxNumberOfMessages=1) rather than relying on
ReceiveBatch — either implement Receive to call the SQS ReceiveMessage API with
MaxNumberOfMessages=1 (or add a private helper that takes maxMessages and use it
for both Receive and ReceiveBatch) so Receive returns at most one message while
leaving ReceiveBatch behavior unchanged (reference Receive, ReceiveBatch,
c.maxMessages, and the SQS ReceiveMessage call).
- Around line 23-50: WithVisibilityTimeout and WithWaitTimeSeconds currently
accept any int64 and should validate/clamp to AWS SQS limits like
WithMaxMessages does; update the option closures for WithVisibilityTimeout to
clamp seconds to the 0–43200 range before assigning to Client.visibilityTimeout,
and update WithWaitTimeSeconds to clamp seconds to the 0–20 range before
assigning to Client.waitTimeSeconds so invalid configs are caught early (follow
the same pattern used in WithMaxMessages).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a717557d-14f8-4012-99c7-cd6e00812924

📥 Commits

Reviewing files that changed from the base of the PR and between 8eb82fd and 6721df9.

📒 Files selected for processing (1)
  • sqs/sqs.go

- Receive() was delegating to ReceiveBatch() which fetches up to
  maxMessages, then returning only the first. The remaining messages
  became invisible for the visibility timeout but were never deleted.
  Now uses a private receiveMessages(1) helper so only 1 message is
  fetched from SQS.
- WithVisibilityTimeout now clamps to 0–43200 (AWS SQS limit).
- WithWaitTimeSeconds now clamps to 0–20 (AWS SQS limit).
- Follows the same validation pattern already used in WithMaxMessages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
sqs/sqs.go (1)

25-34: Move parameter clamping outside closures to eliminate mutation of captured values.

The three Option builders (WithVisibilityTimeout, WithMaxMessages, WithWaitTimeSeconds) mutate captured parameters inside their closures. While the current usage pattern is sequential, this design creates a potential concurrency hazard if an Option instance is ever reused or shared across goroutines. Clamp the parameters before returning the closure instead.

Also applies to: lines 40–48, 54–62

♻️ Proposed refactor
 func WithVisibilityTimeout(seconds int64) Option {
+	if seconds < 0 {
+		seconds = 0
+	}
+	if seconds > 43200 {
+		seconds = 43200
+	}
 	return func(c *Client) {
-		if seconds < 0 {
-			seconds = 0
-		}
-		if seconds > 43200 {
-			seconds = 43200
-		}
 		c.visibilityTimeout = seconds
 	}
 }

 func WithMaxMessages(n int64) Option {
+	if n < 1 {
+		n = 1
+	}
+	if n > 10 {
+		n = 10
+	}
 	return func(c *Client) {
-		if n < 1 {
-			n = 1
-		}
-		if n > 10 {
-			n = 10
-		}
 		c.maxMessages = n
 	}
 }

 func WithWaitTimeSeconds(seconds int64) Option {
+	if seconds < 0 {
+		seconds = 0
+	}
+	if seconds > 20 {
+		seconds = 20
+	}
 	return func(c *Client) {
-		if seconds < 0 {
-			seconds = 0
-		}
-		if seconds > 20 {
-			seconds = 20
-		}
 		c.waitTimeSeconds = seconds
 	}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@sqs/sqs.go` around lines 25 - 34, The Option builders (WithVisibilityTimeout,
WithMaxMessages, WithWaitTimeSeconds) currently clamp their input parameters
inside the returned closure which mutates captured variables and can cause
concurrency hazards if the Option is reused; move the clamping logic out of the
closure by normalizing the input (apply min/max limits) before returning the
func(c *Client) closure so the closure only assigns the already-clamped value to
fields (visibilityTimeout, maxMessages, waitTimeSeconds) without mutating
captured variables.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@sqs/sqs.go`:
- Around line 25-34: The Option builders (WithVisibilityTimeout,
WithMaxMessages, WithWaitTimeSeconds) currently clamp their input parameters
inside the returned closure which mutates captured variables and can cause
concurrency hazards if the Option is reused; move the clamping logic out of the
closure by normalizing the input (apply min/max limits) before returning the
func(c *Client) closure so the closure only assigns the already-clamped value to
fields (visibilityTimeout, maxMessages, waitTimeSeconds) without mutating
captured variables.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b0af96f0-55d2-48f7-a695-a5efcdca0246

📥 Commits

Reviewing files that changed from the base of the PR and between 6721df9 and 3fae988.

📒 Files selected for processing (1)
  • sqs/sqs.go

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant